Deep Autoencoder for Combined Human Pose Estimation and Body Model Upscaling
We present a method for simultaneously estimating 3D human pose and body
shape from a sparse set of wide-baseline camera views. We train a symmetric
convolutional autoencoder with a dual loss that enforces learning of a latent
representation that encodes skeletal joint positions, and at the same time
learns a deep representation of volumetric body shape. We harness the latter to
up-scale input volumetric data, whilst recovering a 3D estimate of joint
positions with equal or greater accuracy than the state of the art. Inference
runs in real-time (25 fps) and has the potential for passive human behaviour
monitoring where there is a requirement for high-fidelity estimation of human
body shape and pose.
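The dual-loss idea described above can be illustrated with a toy sketch (this is not the authors' network; the linear "encoder" and weight shapes below are invented for illustration): a shared latent code is decoded by two heads, one reconstructing a higher-resolution volume and one regressing 3D joint positions, and the two losses are summed so training shapes a single latent representation for both tasks.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(volume, W_enc):
    # Toy stand-in for the convolutional encoder: flatten and project
    # the input volume to a latent code.
    return np.tanh(volume.ravel() @ W_enc)

def dual_loss(volume_lowres, volume_highres, joints_gt, W_enc, W_vol, W_pose):
    z = encode(volume_lowres, W_enc)
    vol_pred = z @ W_vol                      # volumetric (up-scaling) head
    pose_pred = (z @ W_pose).reshape(-1, 3)   # skeletal-joint head
    loss_vol = np.mean((vol_pred - volume_highres.ravel()) ** 2)
    loss_pose = np.mean((pose_pred - joints_gt) ** 2)
    return loss_vol + loss_pose               # the combined "dual" loss

# Invented dimensions: 4^3 input volume, 8^3 target volume, 17 joints.
vol_lo = rng.normal(size=(4, 4, 4))
vol_hi = rng.normal(size=(8, 8, 8))
joints = rng.normal(size=(17, 3))
W_enc = rng.normal(size=(64, 32)) * 0.1
W_vol = rng.normal(size=(32, 512)) * 0.1
W_pose = rng.normal(size=(32, 51)) * 0.1
print(dual_loss(vol_lo, vol_hi, joints, W_enc, W_vol, W_pose))
```

In the actual system both heads are trained jointly, so minimising the summed loss forces the latent code to carry both body-shape and joint-position information.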
Who Left the Dogs Out? 3D Animal Reconstruction with Expectation Maximization in the Loop
We introduce an automatic, end-to-end method for recovering the 3D pose and
shape of dogs from monocular internet images. The large variation in shape
between dog breeds, significant occlusion, and the low quality of internet
images make this a challenging problem. We learn a richer prior over shapes than
previous work, which helps regularize parameter estimation. We demonstrate
results on the Stanford Dog dataset, an 'in the wild' dataset of 20,580 dog
images for which we have collected 2D joint and silhouette annotations, split
into training and evaluation sets. In order to capture the large shape variety of
dogs, we show that the natural variation in the 2D dataset is enough to learn a
detailed 3D prior through expectation maximization (EM). As a by-product of
training, we generate a new parameterized model, SMBLD (with limb scaling),
which we release alongside our new annotation dataset, StanfordExtra, to the
research community.
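The "EM in the loop" idea can be sketched on a toy one-dimensional model (this is not the SMBLD pipeline; the noise model and numbers are invented): noisy per-image shape-parameter fits are combined with a Gaussian prior, the E-step computes the posterior mean of each latent shape, and the M-step refits the prior to those posteriors, so a detailed prior emerges from the natural variation in the 2D data.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic ground truth: latent per-dog shape parameters plus fitting noise.
true_mu, true_var, noise_var = 2.0, 0.5, 0.2
betas = rng.normal(true_mu, np.sqrt(true_var), size=500)   # latent shapes
obs = betas + rng.normal(0, np.sqrt(noise_var), size=500)  # noisy 2D fits

mu, sigma2 = 0.0, 1.0                                      # initial prior
for _ in range(50):
    # E-step: Gaussian posterior over each latent shape given the
    # current prior N(mu, sigma2) and the noisy observation.
    post_var = 1.0 / (1.0 / sigma2 + 1.0 / noise_var)
    post_mean = post_var * (mu / sigma2 + obs / noise_var)
    # M-step: refit the prior to the posterior statistics.
    mu = post_mean.mean()
    sigma2 = post_mean.var() + post_var

print(mu, sigma2)  # recovers roughly (2.0, 0.5)
```

The fixed point of this loop is the maximum-likelihood prior given the known noise level, which is why alternating E- and M-steps recovers a tighter prior than simply pooling the raw noisy fits.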
Monocular Expressive Body Regression through Body-Driven Attention
To understand how people look, interact, or perform tasks, we need to quickly
and accurately capture their 3D body, face, and hands together from an RGB
image. Most existing methods focus only on parts of the body. A few recent
approaches reconstruct full expressive 3D humans from images using 3D body
models that include the face and hands. These methods are optimization-based
and thus slow, prone to local optima, and require 2D keypoints as input. We
address these limitations by introducing ExPose (EXpressive POse and Shape
rEgression), which directly regresses the body, face, and hands, in SMPL-X
format, from an RGB image. This is a hard problem due to the high
dimensionality of the body and the lack of expressive training data.
Additionally, hands and faces are much smaller than the body, occupying very
few image pixels. This makes hand and face estimation hard when body images are
downscaled for neural networks. We make three main contributions. First, we
account for the lack of training data by curating a dataset of SMPL-X fits on
in-the-wild images. Second, we observe that body estimation localizes the face
and hands reasonably well. We introduce body-driven attention for face and hand
regions in the original image to extract higher-resolution crops that are fed
to dedicated refinement modules. Third, these modules exploit part-specific
knowledge from existing face- and hand-only datasets. ExPose estimates
expressive 3D humans more accurately than existing optimization methods at a
small fraction of the computational cost. Our data, model and code are
available for research at https://expose.is.tue.mpg.de. Accepted at ECCV'20.
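The body-driven attention step can be sketched as plain cropping logic (the function, image, and keypoint below are illustrative, not ExPose's actual code): the coarse body estimate localizes the face and hands, so a square region around each predicted part is cut from the original full-resolution image and handed to a dedicated refinement module, avoiding the resolution lost when the whole body image is downscaled.

```python
import numpy as np

def crop_region(image, center_xy, half_size):
    """Extract a square crop around a part location predicted by the
    coarse body stage, clamped to the image bounds."""
    h, w = image.shape[:2]
    cx, cy = center_xy
    x0, x1 = max(0, cx - half_size), min(w, cx + half_size)
    y0, y1 = max(0, cy - half_size), min(h, cy + half_size)
    return image[y0:y1, x0:x1]

image = np.arange(256 * 256).reshape(256, 256)  # stand-in full-res image
face_center = (180, 60)                         # from the coarse body estimate
face_crop = crop_region(image, face_center, 32)
print(face_crop.shape)  # (64, 64)
```

In the full system each such crop is fed to a part-specific network trained on face- or hand-only datasets, and the refined parameters are merged back into the SMPL-X estimate.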
Body Shape Privacy in Images: Understanding Privacy and Preventing Automatic Shape Extraction
Modern approaches to pose and body shape estimation have recently achieved strong performance even under challenging real-world conditions. Even from a single image of a clothed person, a realistic-looking body shape can be inferred that captures a user's weight group and body shape type well. This opens up a whole spectrum of applications -- in particular in fashion -- where virtual try-on and recommendation systems can make use of these new and automated cues. However, a realistic depiction of the undressed body is regarded as highly private, and most people might not consent to it. Hence, we ask whether the automatic extraction of such information can be effectively evaded. While adversarial perturbations have been shown to be effective for manipulating the output of machine learning models -- in particular, end-to-end deep learning approaches -- state-of-the-art shape estimation methods are composed of multiple stages. We perform the first investigation of different strategies that can be used to effectively manipulate automatic shape estimation while preserving the overall appearance of the original image.
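The evasion principle behind adversarial perturbations can be illustrated with a minimal sign-gradient (FGSM-style) step on a toy linear model (this is not the paper's multi-stage pipeline; the "shape estimator" below is an invented stand-in): the input is nudged in the direction that most changes the estimator's output, while the per-pixel change stays bounded so the image looks unchanged.

```python
import numpy as np

rng = np.random.default_rng(2)

W = rng.normal(size=(16,))       # toy linear "shape estimator"
x = rng.normal(size=(16,))       # input image features
clean_out = W @ x                # estimator output on the clean input

eps = 0.05                       # perturbation budget (keeps appearance)
grad = W                         # gradient of W @ x with respect to x
x_adv = x + eps * np.sign(grad)  # sign-based step to shift the output

shift = W @ x_adv - clean_out    # equals eps * sum(|W|), always positive
print(shift)
```

For multi-stage estimators, as the abstract notes, a single end-to-end gradient is not directly available, which is why the work investigates several manipulation strategies rather than one gradient attack.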